The information entropy of quantum mechanical states 



Alexander Stotland (1), Andrei A. Pomeransky (2), Eitan Bachmat (3) and Doron Cohen (1)

(1) Department of Physics, Ben-Gurion University, Beer-Sheva 84105, Israel

(2) Laboratoire de Physique Theorique, UMR 5152 du CNRS, Universite Paul Sabatier, 31062 Toulouse Cedex 4, France

(3) Department of Computer Science, Ben-Gurion University, Beer-Sheva 84105, Israel



PACS. 03.65.Ta - Foundations of quantum mechanics; measurement theory. 
PACS. 03.67.-a - Quantum information.



Abstract. - It is well known that a Shannon-based definition of information entropy leads,
in the classical case, to the Boltzmann entropy. It is tempting to regard the von Neumann
entropy as the corresponding quantum mechanical definition. But the latter is problematic from
a quantum information point of view. Consequently we introduce a new definition of entropy
that reflects the inherent uncertainty of quantum mechanical states. We derive an explicit
expression for it, and discuss some of its general properties. We distinguish between the
minimum uncertainty entropy of pure states and the excess statistical entropy of mixtures.



The statistical state of a system ($\rho$) is specified in classical mechanics by a probability
function, while in the quantum mechanical case it is specified by a probability matrix. The
information entropy $S[\rho]$ is a measure of the amount of extra information that is required
in order to predict the outcome of a measurement. If no extra information is needed we say
that the system is in a definite statistical state with $S = 0$. A classical system can in
principle be prepared in a definite state. But this is not true for a quantum mechanical system.
Even if the system is prepared in a pure state, there is still an inherent uncertainty regarding
the outcome of a general measurement. Therefore the minimum information entropy of a
quantum mechanical state is larger than zero.

It is clear that the common von Neumann definition of quantum mechanical entropy does
not reflect the inherent uncertainty which is associated with quantum mechanical states [1,2].
For a pure state it gives $S = 0$. Let us assume that we prepare two spins in a (pure) singlet
state. In such a case the von Neumann entropy of a single spin is $S = \ln(2)$, while the
system as a whole has $S = 0$. If it were meaningful to give these results an information
theoretic interpretation, it would be implied that the amount of information which is needed
to determine the outcome of a measurement of a subsystem is larger than the amount of
information which is required in order to determine the outcome of a measurement of the
whole system. This does not make sense.
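The singlet example can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the standard singlet amplitudes $(|{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle)/\sqrt{2}$; the helper name `von_neumann_entropy` and the array layout are illustrative choices, not part of the original text.

```python
import math

def von_neumann_entropy(eigenvalues):
    """Return -sum p ln(p) over the eigenvalues, with the convention 0 ln 0 = 0."""
    return -sum(p * math.log(p) for p in eigenvalues if p > 0.0)

# Singlet amplitudes c[a][b] for the product basis |a>|b>, with 0 = up, 1 = down:
# (|01> - |10>) / sqrt(2). The amplitudes are real, so no complex conjugation is needed.
c = [[0.0, 1.0 / math.sqrt(2.0)],
     [-1.0 / math.sqrt(2.0), 0.0]]

# Reduced matrix of the first spin: sigma[a][a2] = sum_b c[a][b] * c[a2][b]
sigma = [[sum(c[a][b] * c[a2][b] for b in range(2)) for a2 in range(2)]
         for a in range(2)]

# sigma is diagonal here, so its eigenvalues are its diagonal entries.
entropy_single_spin = von_neumann_entropy([sigma[0][0], sigma[1][1]])
# The two-spin state is pure: a single eigenvalue equal to 1.
entropy_whole_system = von_neumann_entropy([1.0])

print(entropy_single_spin, math.log(2.0), entropy_whole_system)
```

The single-spin value agrees with $\ln(2)$ while the value for the whole system vanishes, which is the mismatch discussed above.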

Thus we are faced with the need to give a proper definition for the (information) entropy
of a quantum mechanical state. As in the case of the von Neumann entropy, it can be regarded
as a measure for the lack of purity of a general (mixed) state. But unlike the von Neumann
entropy it does not give $S = 0$ for pure states, and does not coincide with the thermodynamic
entropy in the case of a thermal state.

In this Letter we introduce a Shannon-based definition of quantum mechanical information
entropy, derive explicit expressions for the calculation of this entropy, and discuss some of its
properties. For further motivation and a review of the traditional definition of entropy in the
context of quantum computation and quantum information see [3].

The statistical state of a classical system that can be found in one of $N$ possible states $r$
is characterized by the corresponding probabilities $p_r$, with the normalization $\sum_r p_r = 1$.
The amount of information which is required in order to know what is going to be the outcome
of a measurement is given by the Shannon formula: $S = -\sum_r p_r \ln(p_r)$. Note that $S = 0$ if
the system is in a definite state, while $S = \ln(N)$ in the worst case of a uniform distribution.
This definition coincides with the Boltzmann definition of entropy if the $r$ are regarded as
phase space cells.
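The two limiting cases of the Shannon formula quoted above are easy to confirm numerically. This is a minimal sketch; the function name `shannon_entropy` and the choice $N = 4$ are assumptions made for illustration only.

```python
import math

def shannon_entropy(probabilities):
    """S = -sum_r p_r ln(p_r), with the convention 0 ln 0 = 0."""
    total = sum(probabilities)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

N = 4
definite_state = [1.0, 0.0, 0.0, 0.0]   # system known to be in state r = 1
uniform = [1.0 / N] * N                 # worst case: all states equally likely

print(shannon_entropy(definite_state))        # vanishes for a definite state
print(shannon_entropy(uniform), math.log(N))  # the two numbers agree
```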

In the quantum mechanical case the statistical state of a system is described by a probability
matrix $\rho$. A measurement requires the specification of a basis of (pure) states $|a\rangle$.
Without any loss of generality it is convenient to define a given basis by specifying a hermitian
operator $A$. We note that in a semiclassical context the basis $A$ can be regarded as a
partitioning of phase space into cells. The probability to have $a$ as the outcome of a
measurement is $\langle a|\rho|a\rangle$. Therefore the information entropy for such a measurement is

$$ S[\rho|A] = -\sum_a \langle a|\rho|a\rangle \ln(\langle a|\rho|a\rangle) \tag{1} $$

Our notation emphasizes that this is in fact a conditional entropy: one has to specify in
advance what the measurement setup is. In particular there is a basis $\mathcal{H}$ in which
$\rho$ is diagonal, $\rho = \mathrm{diag}\{p_r\}$. In this basis $S[\rho|A]$ attains its minimum value

$$ S_H[\rho] = S[\rho|\mathcal{H}] = -\sum_r p_r \ln(p_r) = -\mathrm{trace}(\rho \ln \rho) \tag{2} $$

which is known as the von Neumann entropy. We would like to emphasize that from a strict
information theory point of view, the quantity $S_H[\rho]$ can be interpreted as information
entropy (a la Shannon) only if we assume a-priori knowledge of the preferred basis that makes
$\rho$ diagonal. In equilibrium statistical mechanics the interest is in stationary states. This
means that $\rho$ is diagonal in the basis that is determined by the Hamiltonian $\mathcal{H}$.
Therefore, if we measure the energy of the system, the information entropy is indeed
$S_H[\rho]$. In particular for a canonical state $\rho \propto \exp(-\beta\mathcal{H})$ it reduces
to the thermodynamic definition of entropy.
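To see how the conditional entropy of Eq. (1) depends on the measurement basis, consider a two-level system. The sketch below assumes $\rho = \mathrm{diag}(p_1, 1-p_1)$ and a real basis rotated by an angle `theta` relative to the diagonalizing basis; this parametrization and the function names are hypothetical, chosen only to illustrate that Eq. (2) is the minimum of Eq. (1).

```python
import math

def f(s):
    """f(s) = -s ln(s), with f(0) = 0."""
    return -s * math.log(s) if s > 0.0 else 0.0

def von_neumann_entropy_qubit(p1):
    """Eq. (2) for rho = diag(p1, 1 - p1)."""
    return f(p1) + f(1.0 - p1)

def measurement_entropy_qubit(p1, theta):
    """Eq. (1) for rho = diag(p1, 1 - p1), measured in the real basis
    |a1> = (cos theta, sin theta), |a2> = (-sin theta, cos theta)."""
    q1 = p1 * math.cos(theta) ** 2 + (1.0 - p1) * math.sin(theta) ** 2  # <a1|rho|a1>
    return f(q1) + f(1.0 - q1)                                          # <a2|rho|a2> = 1 - q1

for theta in (0.0, math.pi / 8, math.pi / 4):
    print(theta, measurement_entropy_qubit(0.9, theta), von_neumann_entropy_qubit(0.9))
```

For `theta = 0` the two numbers coincide; for any other angle the measurement entropy is larger, and even a pure state (`p1 = 1`) yields $\ln(2)$ at `theta = pi/4`.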

For a pure quantum mechanical state $\rho = |\Psi\rangle\langle\Psi|$ the von Neumann definition
gives $S_H[\rho] = 0$. This seems to imply that a pure quantum mechanical state lacks a
statistical nature. This is of course not correct: for a general measurement we have uncertainty.
An absolute definition of the information entropy of a quantum mechanical state should not
assume any special basis. This implies a unique definition of the absolute entropy. Using standard
information theory argumentation we conclude that (see footnote 1)

$$ S[\rho] = \overline{S[\rho|A]} = S_0(N) + F(p_1, p_2, \ldots) \equiv S_0(N) + S_F[\rho] \tag{3} $$

where the overline indicates averaging over all possible basis sets with uniform measure (no
preferred basis). We would like to emphasize that the averaging procedure is unique: a choice
of a basis is like a choice of "direction" in a $2N-1$ dimensional space (in the case of spin 1/2
this direction can be interpreted as the geometrical orientation of our $xyz$ axes in physical
space). The second equality in Eq. (3) gives an explicit expression for the absolute entropy,
which we are going to derive below. The result is written as a sum of two terms: the first
term is the minimum uncertainty entropy of a quantum mechanical state, achieved by a pure
state, while the second term gives the deviation from purity. We shall call the second term
the excess statistical entropy and will use for it the notation $S_F[\rho]$. Conceptually it is
meaningful to ask to what extent $S_F[\rho]$ is correlated with $S_H[\rho]$. We shall discuss
this issue later on.

Footnote 1: If we regard the measurement apparatus as a part of the system, then information
theory tells us that the total entropy is $S_{\rm total} = S[A] + \sum_A P(A)\, S[\rho|A]$. The
probability $P(A)$ describes our lack of knowledge regarding the state of the apparatus, and
$S[A]$ is its corresponding entropy. Quantum mechanics assumes that there is no preferred basis.

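Assume that $\rho = \mathrm{diag}\{p_r\}$ is diagonal in some basis $\mathcal{H}$. We can regard
all the possible $A$ basis sets as unitary "rotations" of $\mathcal{H}$. This means that any
$a \in A$ in the rotated basis is obtained from a state $r \in \mathcal{H}$ in the preferred basis
by an operation $U$. Consequently

$$ S[\rho] = \sum_a \overline{ f\Big( \sum_r p_r |\langle r|a\rangle|^2 \Big) } = \sum_s \overline{ f\Big( \sum_r p_r |\langle r|U|s\rangle|^2 \Big) } \tag{4} $$

$$ = N\, \overline{ f\Big( \sum_r p_r |\langle r|\Psi\rangle|^2 \Big) } = N\, \overline{ f\Big( \sum_r p_r (x_r^2 + y_r^2) \Big) } = N \int_0^\infty f(s) P(s)\, ds \tag{5} $$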

where we use the notation $f(s) = -s\ln(s)$. For each $s$, averaging $|\langle r|U|s\rangle|^2$ over
$U$ in Eq. (4) is equivalent to averaging $|\langle r|\Psi\rangle|^2$ over all possible $\Psi$, so
the $N$ terms contribute equally, which leads to Eq. (5). It is important to re-emphasize
that the quantum mechanical "democracy" uniquely defines the measure for this $\Psi$ average.
This becomes more transparent if we define $x_r$ and $y_r$ as the real and imaginary parts of
$\Psi_r = \langle r|\Psi\rangle$. The normalization condition is $\sum_r (x_r^2 + y_r^2) = 1$. Hence
in the final expression the average is over all possible directions in a $2N-1$ dimensional space.
In the final expression we introduce the notation

$$ s = \sum_r p_r |\Psi_r|^2 = \sum_r p_r (x_r^2 + y_r^2) \tag{6} $$

and its probability distribution is denoted P(s). In what follows we discuss the calculation of 
P(s) and its integral with f(s). 
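Before turning to the analytical calculation, the average in Eq. (5) can be estimated by direct sampling. The following is a rough Monte Carlo sketch, not the method used in the text: it assumes that a uniformly distributed direction is obtained by normalizing a vector of $2N$ independent Gaussian components $(x_r, y_r)$, and the function names, sample count and seed are arbitrary illustrative choices.

```python
import math
import random

def f(s):
    """f(s) = -s ln(s), with f(0) = 0."""
    return -s * math.log(s) if s > 0.0 else 0.0

def random_state_weights(n, rng):
    """Return the weights |Psi_r|^2 = x_r^2 + y_r^2 of a state drawn uniformly
    from the unit sphere (a normalized vector of 2n Gaussian components)."""
    w = [rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]

def absolute_entropy_mc(p, samples=50000, seed=0):
    """Monte Carlo estimate of S[rho] = N * average of f(s), where
    s = sum_r p_r |Psi_r|^2 as in Eq. (6) and p lists the eigenvalues of rho."""
    rng = random.Random(seed)
    n = len(p)
    acc = 0.0
    for _ in range(samples):
        w = random_state_weights(n, rng)
        s = sum(pr * wr for pr, wr in zip(p, w))
        acc += f(s)
    return n * acc / samples

print(absolute_entropy_mc([1.0, 0.0]))               # pure state in N = 2
print(absolute_entropy_mc([0.5, 0.5]), math.log(2))  # maximally mixed state in N = 2
```

For the maximally mixed state every sample gives $s = 1/N$, so the estimate reproduces $\ln(N)$ without statistical error; for other states the estimate fluctuates with the number of samples.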

In the case of a maximally mixed state $P(s)$ is delta distributed around $s = 1/N$, and hence
$\overline{f(s)} = \ln(N)/N$. The corresponding information entropy is therefore $S[\rho] = \ln(N)$,
as expected. If the state is not maximally mixed then $P(s)$ becomes non-trivial. In the case of
a pure state $s = |\Psi_1|^2$ and its distribution is well known [4]:

$$ P(s) = (N-1)(1-s)^{N-2} \tag{7} $$

Thus we get an expression for the "minimum uncertainty entropy", which is $N$ dependent:

$$ S_0(N) = \sum_{k=2}^{N} \frac{1}{k} \approx \ln(N) - (1-\gamma) + \frac{1}{2N} \tag{8} $$
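Eq. (8) can be cross-checked in two ways: by integrating $N f(s) P(s)$ with the pure-state distribution of Eq. (7), and by comparing the exact sum with its asymptotic form. The sketch below uses a plain midpoint rule; the function names, the step count and the hard-coded value of Euler's constant are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch

def s0_exact(n):
    """Exact form of Eq. (8): sum_{k=2}^{n} 1/k."""
    return sum(1.0 / k for k in range(2, n + 1))

def s0_asymptotic(n):
    """Asymptotic form of Eq. (8): ln(n) - (1 - gamma) + 1/(2n)."""
    return math.log(n) - (1.0 - EULER_GAMMA) + 1.0 / (2.0 * n)

def s0_by_integration(n, steps=200000):
    """N * integral_0^1 f(s) P(s) ds with P(s) = (N-1)(1-s)^(N-2), midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += -s * math.log(s) * (n - 1) * (1.0 - s) ** (n - 2)
    return n * total * h

for n in (2, 5, 20):
    print(n, s0_exact(n), s0_by_integration(n), s0_asymptotic(n))
```

The difference `math.log(n) - s0_exact(n)` stays below $1-\gamma$ for every `n` and approaches it as `n` grows.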

Using the asymptotic approximation in the last equality we see that the difference between the
$S$ of a maximally mixed state and that of a pure state approaches a universal value $(1-\gamma)$,
where $\gamma$ is Euler's constant. Phrased differently, the excess statistical entropy
is universally bounded:

$$ S_F[\rho] < 1 - \gamma \tag{9} $$

To get an actual expression for the excess statistical entropy, due to lack of purity, requires
some more effort. The first stage is to calculate $P(s)$, leading to (see appendix):



$$ P(s) = (N-1) \sum_{r\,(p_r > s)} \Big[ \prod_{r'(\neq r)} \frac{1}{p_r - p_{r'}} \Big] (p_r - s)^{N-2} \tag{10} $$



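The second stage is to calculate the integral of Eq. (5) using

$$ \int_0^p (p-s)^{N-2}\, s \ln(s)\, ds = \frac{p^N}{N(N-1)} \Big[ \ln(p) - \sum_{k=2}^{N} \frac{1}{k} \Big] $$

and then to use the identity (see appendix)

$$ \sum_r p_r^N \prod_{r'(\neq r)} \frac{1}{p_r - p_{r'}} = \sum_r p_r = 1 \tag{11} $$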
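The identity of Eq. (11) is easy to test numerically for a non-degenerate set of eigenvalues. The sketch below uses a hypothetical spectrum chosen only for illustration; it also evaluates the same sum with lower powers, where it vanishes for powers up to $N-2$ (a standard property of sums of this divided-difference form).

```python
def power_sum_over_differences(p, n):
    """sum_r p_r^n * prod_{r' != r} 1 / (p_r - p_r'), for distinct p_r."""
    total = 0.0
    for r, pr in enumerate(p):
        denominator = 1.0
        for r2, pr2 in enumerate(p):
            if r2 != r:
                denominator *= pr - pr2
        total += pr ** n / denominator
    return total

p = [0.5, 0.3, 0.15, 0.05]   # hypothetical non-degenerate spectrum, sums to 1
N = len(p)

print(power_sum_over_differences(p, N), sum(p))                 # Eq. (11): both equal 1
print([power_sum_over_differences(p, n) for n in range(N - 1)]) # powers 0..N-2: all vanish
```

The formula is singular for degenerate eigenvalues, so a degenerate spectrum has to be treated as a limit of nearby non-degenerate ones.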



Hence one obtains

$$ F(p_1, p_2, \ldots) = -\sum_r \Big[ \prod_{r'(\neq r)} \frac{p_r}{p_r - p_{r'}} \Big] p_r \ln(p_r) \tag{12} $$



This expression is independent of $N$. Namely, extra zero eigenvalues do not have any effect
on the result. Some particular cases are of interest. For a mixture of two states we get

$$ F(p_1, p_2) = -\frac{1}{p_1 - p_2} \big( p_1^2 \ln(p_1) - p_2^2 \ln(p_2) \big) \tag{13} $$

For a uniform mixture of $n$ states we get

$$ S_F[\rho] = \ln(n) - \sum_{k=2}^{n} \frac{1}{k} \tag{14} $$

$$ S[\rho] = \ln(n) + \sum_{n<k\leq N} \frac{1}{k} \tag{15} $$
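The general expression of Eq. (12) and its special cases, Eqs. (13) and (14), can be compared numerically. This is a minimal sketch under two assumptions: the non-zero eigenvalues are distinct (the formula is singular otherwise), and the uniform mixture is approached through a slightly perturbed spectrum. The function names and the test spectra are illustrative.

```python
import math

def excess_entropy(p):
    """F(p_1, p_2, ...) of Eq. (12) for distinct non-zero eigenvalues.
    Zero eigenvalues are dropped, since they do not affect the result."""
    p = [x for x in p if x > 0.0]
    total = 0.0
    for r, pr in enumerate(p):
        weight = 1.0
        for r2, pr2 in enumerate(p):
            if r2 != r:
                weight *= pr / (pr - pr2)
        total += weight * pr * math.log(pr)
    return -total

def excess_entropy_two(p1, p2):
    """Eq. (13): mixture of two states."""
    return -(p1 ** 2 * math.log(p1) - p2 ** 2 * math.log(p2)) / (p1 - p2)

def excess_entropy_uniform(n):
    """Eq. (14): uniform mixture of n states."""
    return math.log(n) - sum(1.0 / k for k in range(2, n + 1))

print(excess_entropy([0.7, 0.3]), excess_entropy_two(0.7, 0.3))
print(excess_entropy([0.7, 0.3, 0.0, 0.0]))          # extra zeros change nothing
eps = 1e-3                                           # small splitting of a uniform spectrum
print(excess_entropy([1 / 3 + eps, 1 / 3, 1 / 3 - eps]), excess_entropy_uniform(3))
```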



Either $S_H[\rho]$ or $S_F[\rho]$ can serve as a measure for lack of purity. In Fig. 1 we present
results of a calculation of $S_F[\rho]$ versus $S_H[\rho]$ for a set of representative states, both
uniform and non-uniform mixtures. We see that there is a very strong correlation between these
two (different) measures of purity.

Our definition of entropy has some interesting mathematical properties. One simple property
is concavity: given $0 < \lambda < 1$ and two sets of probabilities we have

$$ F(\lambda p_r + (1-\lambda) q_r) \geq \lambda F(p_r) + (1-\lambda) F(q_r) \tag{16} $$



This follows from the concavity of $f(s)$ in Eq. (5). Concavity and symmetry with respect
to the variables $p_r$ imply that $S[\rho]$ attains its maximum for maximally mixed states and its
minimum for pure states. This property is helpful for justifying arguments that are based
on "worst case" calculations. Below we list some less trivial properties which are of physical
interest.

Consider a system in a state $\rho$, and its subsystem which is in some state $\sigma$. Technically
the reduced probability matrix $\sigma$ is obtained from $\rho$ by tracing over the irrelevant indexes.
From general information theoretic considerations we expect

$$ S[\sigma] < S[\rho] \tag{17} $$

This means that determination of the state of a subsystem requires less information. As
explained in the introduction this inequality is violated by the von Neumann entropy. But
with our definition $S[\rho] \geq S_0(MN) \geq S_0(2N) > \ln(N) \geq S[\sigma]$, where $N$ and $MN$
(with $M \geq 2$) are the dimensions of $\sigma$ and $\rho$ respectively.

Another common physical situation is having a state $\rho = \sigma_A \otimes \sigma_B$, where
$\sigma_A$ and $\sigma_B$ are states of subsystems that were prepared independently. Obviously we
have the property

$$ S[\rho|A \otimes B] = S[\sigma_A|A] + S[\sigma_B|B] \tag{18} $$

But for the absolute information entropy we expect

$$ S[\rho] \geq S[\sigma_A] + S[\sigma_B] \tag{19} $$

This comes about because there are bases which are not an "external tensor product" of an
$A$-basis and a $B$-basis. Thus this inequality reflects the greater uncertainty that we have in
the state determination of the combined system. Note that if our world were classical we would
get an equality, which is the case with the Boltzmann entropy, and in fact also with the von
Neumann entropy. In order to better establish Eq. (19) we can consider a worst case scenario.
Let $N$ and $M$ be the dimensions of $\sigma_A$ and $\sigma_B$ respectively. Assume that these
states are uniform mixtures of $n$ and $m$ states respectively; then $\rho$ is a uniform mixture
of $nm$ states in dimension $NM$. Using Eq. (15) and the inequality

$$ \sum_{k=nm+1}^{NM} \frac{1}{k} = \sum_{k=nm+1}^{mN} \frac{1}{k} + \sum_{k=mN+1}^{NM} \frac{1}{k} = \sum_{k_1=n+1}^{N} \sum_{l_1=0}^{m-1} \frac{1}{k_1 m - l_1} + \sum_{k_2=m+1}^{M} \sum_{l_2=0}^{N-1} \frac{1}{k_2 N - l_2} \geq \sum_{k=n+1}^{N} \frac{1}{k} + \sum_{k=m+1}^{M} \frac{1}{k} $$

we confirm that Eq. (19) is indeed satisfied. A particular case of the inequality of Eq. (19) is
that the minimum uncertainty entropy satisfies

$$ S_0(NM) > S_0(N) + S_0(M) \tag{20} $$

But what about the excess statistical entropy? Our conjecture is that

$$ S_F[\rho] \leq S_F[\sigma_A] + S_F[\sigma_B] \tag{21} $$

We can again establish this inequality for uniform mixtures of $n$ and $m$ states in dimensions
$N$ and $M$ respectively. Using Eq. (14) we observe that

$$ S_F[\rho] - S_F[\sigma_A] - S_F[\sigma_B] = S_0(n) + S_0(m) - S_0(nm) $$

which is negative by Eq. (20). It is important to realize that Eq. (20) overcompensates the
inequality Eq. (21), leading to Eq. (19). It is well known that for the von Neumann entropy we
have the general inequality

$$ S_H[\rho] \leq S_H[\sigma_A] + S_H[\sigma_B] \tag{22} $$

which holds for any subdivision of a system into two (correlated) subsystems. We already
observed (Fig. 1) that $S_F[\rho]$ is strongly correlated with $S_H[\rho]$. Moreover, this
correlation is sublinear. It follows that we expect the easier inequality Eq. (21) to hold in
general, also in the case of correlated subsystems.
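For products of uniform mixtures, the inequalities of Eqs. (19) and (21) reduce to statements about harmonic sums and can be scanned by brute force. The sketch below is only a numerical sanity check over small dimensions, not a proof; it tests the non-strict versions of the inequalities, so that equality cases (for example pure or maximally mixed factors) pass, and all function names are illustrative.

```python
import math

def s0(n):
    """Exact form of Eq. (8): sum_{k=2}^{n} 1/k."""
    return sum(1.0 / k for k in range(2, n + 1))

def total_entropy_uniform(n, dim):
    """Eq. (15): S[rho] for a uniform mixture of n states in dimension dim."""
    return math.log(n) + sum(1.0 / k for k in range(n + 1, dim + 1))

def excess_entropy_uniform(n):
    """Eq. (14): S_F[rho] for a uniform mixture of n states."""
    return math.log(n) - s0(n)

def check_uniform_products(max_dim, tol=1e-12):
    """Return True if Eqs. (19) and (21) hold for every product of uniform
    mixtures of n and m states in dimensions N, M <= max_dim."""
    for N in range(1, max_dim + 1):
        for M in range(1, max_dim + 1):
            for n in range(1, N + 1):
                for m in range(1, M + 1):
                    total_gap = (total_entropy_uniform(n * m, N * M)
                                 - total_entropy_uniform(n, N)
                                 - total_entropy_uniform(m, M))
                    excess_gap = (excess_entropy_uniform(n * m)
                                  - excess_entropy_uniform(n)
                                  - excess_entropy_uniform(m))
                    if total_gap < -tol or excess_gap > tol:
                        return False
    return True

print(check_uniform_products(6))
print(s0(4), s0(2) + s0(2))   # Eq. (20) for N = M = 2
```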

The effect of quantum measurements on the entropy is of special interest. Let $P_i$ be a
complete orthogonal set of projectors ($\sum_i P_i = 1$). The state after a projective measurement
is $\sigma = \sum_i P_i \rho P_i$. Consequently the state of the system becomes more mixed. This is
indeed reflected by an increase in the von Neumann entropy of the system. Our entropy is also a
measure for lack of purity. Therefore it is reasonable to expect $S[\sigma] \geq S[\rho]$. We were
not able to prove this assertion.

Summary: The von Neumann entropy $S[\rho|\mathcal{H}]$ is useful in the thermodynamic context,
where the interest is a-priori limited to stationary (equilibrium) states. If we want to study
the growth of entropy during an ergodization process, we may consider $S[\rho|A]$, where $A$ is
a basis (or a "partition" of phase space) that does not commute with $\mathcal{H}$. See for example
Ref. [5], where entropy is defined with respect to the position representation. In the latter case
the entropy of a pure state is in general non-zero. In the present study we have derived an
explicit expression for the minimum uncertainty entropy $S_0(N)$ of pure states. This can be
associated with the average over the minimum entropic uncertainty [6]. We have also derived
an expression for the excess statistical entropy $S_F[\rho]$ of mixtures. The latter can be used as
a measure for lack of purity of quantum mechanical states, and it is strongly correlated with the
von Neumann entropy $S_H[\rho]$. It is bounded from above by $(1-\gamma)$, where $\gamma$ is Euler's
constant. The total information entropy $S[\rho]$, unlike the von Neumann entropy, has properties
that do make sense from a quantum information point of view.

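Appendix: Switching to the variables $s_r = x_r^2 + y_r^2$, the definition of $P(s)$ takes the form

$$ P(s) = \overline{ \delta\Big( s - \sum_r p_r (x_r^2 + y_r^2) \Big) } = (N-1)! \int_0^\infty ds_1 \ldots ds_N\, \delta\Big( 1 - \sum_r s_r \Big)\, \delta\Big( s - \sum_r p_r s_r \Big) $$

$$ = (N-1)! \int_0^\infty ds_1 \ldots ds_N \int \frac{d\omega\, d\nu}{(2\pi)^2}\, e^{(1 - \sum_r s_r)(i\nu + 0) + i\omega (s - \sum_r p_r s_r)} $$

where the infinitesimal has been introduced to ensure convergence once the order of integration
is changed. Thus after the integration over $ds_1 \ldots ds_N$ one has

$$ P(s) = (N-1)! \int \frac{d\omega\, d\nu}{(2\pi)^2}\, e^{i\omega s + i\nu} \prod_r \frac{1}{i\omega p_r + i\nu + 0} = (N-1)! \int \frac{d\omega}{2\pi (i\omega)^{N-1}} \sum_r e^{i\omega(s - p_r)} \prod_{r'(\neq r)} \frac{1}{p_{r'} - p_r} $$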

One can show (see below) that there is no singularity in the integral at $\omega = 0$, so one can
deform the contour of integration in such a way that it goes slightly above the point $\omega = 0$.
Then one can do the integral term by term, leading to the final result Eq. (10). Namely, if
$p_r < s$ the contour is closed in the upper half plane, leading to zero, while if $p_r > s$ the
contour is closed in the lower half plane, leading to a non-zero contribution from the
$\omega = 0$ pole.

The above manipulation was based on the observation that the integrand as a whole is
non-singular: the singularities of the individual terms cancel upon summation over $r$.

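This cancellation can be established by expanding the exponent in powers of $\omega$, and using the
identity

$$ \sum_r (s - p_r)^n \prod_{r'(\neq r)} \frac{1}{p_r - p_{r'}} = 0 \quad \text{for } n \leq (N-2) $$

Both this identity and also Eq. (11) can be proved by the following procedure:

$$ \sum_r p_r^n \prod_{k(\neq r)} \frac{1}{p_r - p_k} = \oint \frac{dz}{2\pi i}\, z^n \prod_k \frac{1}{z - p_k} = \oint \frac{dz}{2\pi i}\, z^{N-n-2} \prod_k \frac{1}{1 - p_k z} $$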



where in the last step one changes $z \mapsto 1/z$.

Fig. 1: The excess information entropy $S_F$ of a mixed quantum mechanical state versus the von
Neumann entropy $S_H$. The solid line is for uniform mixtures, while the dots are for randomly
chosen (non-uniform) mixtures. Inset: The information entropy of a pure quantum mechanical state
as a function of the Hilbert space dimension $N$. See Eq. (8). The dashed line is the asymptotic
approximation.